Method and apparatus of audio/video switching

ABSTRACT

Techniques for switching between different playback modes are described herein. The disclosed techniques include detecting a state of playing a content item using Dynamic Adaptive Streaming over HTTP (DASH); determining whether there is a need of switching between a first playback mode and a second playback mode based on the detected state of playing the content item; determining a segment number of a segment among the plurality of segments currently being played based on a timestamp of the segment in response to determining that there is the need of switching between the first playback mode and the second playback mode; obtaining content of the content item based at least in part on the segment number and a playback mode to be switched to; and playing the content item in a switched playback mode.

CROSS-REFERENCE TO RELATED APPLICATION

The present application claims priority of Chinese patent application No. 201910092650.6, filed on Jan. 30, 2019. The entire disclosure of the above-identified application is hereby incorporated by reference herein and made a part of this specification.

BACKGROUND

With the development of the Internet and smart terminals, more and more users play streaming media (e.g., audio and video) using various kinds of smart terminals, such as mobile phones and computers. Users can obtain streaming media content from network servers through the smart terminals, and render the streaming media content through the smart terminals.

BRIEF DESCRIPTION OF THE DRAWINGS

The following detailed description may be better understood when read in conjunction with the appended drawings. For purposes of illustration, there are shown in the drawings exemplary embodiments of various aspects of the disclosure; however, the disclosure is not limited to the specific methods and instrumentalities disclosed.

FIG. 1 is a schematic diagram illustrating an example computing device that may be used in accordance with the present disclosure.

FIG. 2 is a flowchart illustrating an example method for audio/video switching in accordance with the present disclosure.

FIG. 3 is a flowchart illustrating another example method of audio/video switching in accordance with the present disclosure.

FIG. 4 is a flowchart illustrating another example method of audio/video switching in accordance with the present disclosure.

FIG. 5 is a flowchart illustrating another example method of audio/video switching in accordance with the present disclosure.

FIG. 6 is a flowchart illustrating another example method of audio/video switching in accordance with the present disclosure.

FIG. 7 is a flowchart illustrating another example method of audio/video switching in accordance with the present disclosure.

FIG. 8 is a block diagram of program modules of an apparatus for audio/video switching in accordance with the present disclosure.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

FIG. 1 depicts a computing device that may be used in various aspects, such as services, networks, and/or clients. The computer architecture shown in FIG. 1 shows a conventional server computer, workstation, desktop computer, laptop, tablet, network appliance, PDA, e-reader, digital cellular phone, or other computing nodes, and may be utilized to execute any aspects of the computers described herein, such as to implement the methods described herein.

A computing device 20 may include a baseboard, or “motherboard,” which is a printed circuit board to which a multitude of components or devices may be connected by way of a system bus or other electrical communication paths. One or more central processing units (CPUs) 22 may operate in conjunction with a chipset 24. The CPU(s) 22 may be standard programmable processors that perform arithmetic and logical operations necessary for the operation of the computing device 20.

The CPU(s) 22 may perform the necessary operations by transitioning from one discrete physical state to the next through the manipulation of switching elements that differentiate between and change these states. Switching elements may generally include electronic circuits that maintain one of two binary states, such as flip-flops, and electronic circuits that provide an output state based on the logical combination of the states of one or more other switching elements, such as logic gates. These basic switching elements may be combined to create more complex logic circuits including registers, adders-subtractors, arithmetic logic units, floating-point units, and the like.

The CPU(s) 22 may be augmented with or replaced by other processing units, such as GPU(s). The GPU(s) may comprise processing units specialized for but not necessarily limited to highly parallel computations, such as graphics and other visualization-related processing.

A chipset 24 may provide an interface between the CPU(s) 22 and the remainder of the components and devices on the baseboard. The chipset 24 may provide an interface to a random access memory (RAM) 26 used as the main memory in the computing device 20. The chipset 24 may further provide an interface to a computer-readable storage medium, such as a read-only memory (ROM) 28 or non-volatile RAM (NVRAM) (not shown), for storing basic routines that may help to start up the computing device 20 and to transfer information between the various components and devices. ROM 28 or NVRAM may also store other software components necessary for the operation of the computing device 20 in accordance with the aspects described herein.

The computing device 20 may operate in a networked environment using logical connections to remote computing nodes and computer systems through a local area network (LAN). The chipset 24 may include functionality for providing network connectivity through a network interface controller (NIC) 30, such as a gigabit Ethernet adapter. A NIC 30 may be capable of connecting the computing device 20 to other computing nodes over a network 32. It should be appreciated that multiple NICs 30 may be present in the computing device 20, connecting the computing device to other types of networks and remote computer systems.

The computing device 20 may be connected to a mass storage device 34 that provides non-volatile storage for the computer. The mass storage device 34 may store system programs, application programs, other program modules, and data, which have been described in greater detail herein. The mass storage device 34 may be connected to the computing device 20 through a storage controller 36 connected to the chipset 24. The mass storage device 34 may consist of one or more physical storage units. The mass storage device 34 may comprise a management component 38. A storage controller 36 may interface with the physical storage units through a serial attached SCSI (SAS) interface, a serial advanced technology attachment (SATA) interface, a fiber channel (FC) interface, or other types of interface for physically connecting and transferring data between computers and physical storage units.

The computing device 20 may store data on the mass storage device 34 by transforming the physical state of the physical storage units to reflect the information being stored. The specific transformation of a physical state may depend on various factors and on different implementations of this description. Examples of such factors may include, but are not limited to, the technology used to implement the physical storage units and whether the mass storage device 34 is characterized as primary or secondary storage and the like.

For example, the computing device 20 may store information to the mass storage device 34 by issuing instructions through a storage controller 36 to alter the magnetic characteristics of a particular location within a magnetic disk drive unit, the reflective or refractive characteristics of a particular location in an optical storage unit, or the electrical characteristics of a particular capacitor, transistor, or other discrete component in a solid-state storage unit. Other transformations of physical media are possible without departing from the scope and spirit of the present description, with the foregoing examples provided only to facilitate this description. The computing device 20 may further read information from the mass storage device 34 by detecting the physical states or characteristics of one or more particular locations within the physical storage units.

In addition to the mass storage device 34 described above, the computing device 20 may have access to other computer-readable storage media to store and retrieve information, such as program modules, data structures, or other data. It should be appreciated by those skilled in the art that computer-readable storage media may be any available media that provides for the storage of non-transitory data and that may be accessed by the computing device 20.

By way of example and not limitation, computer-readable storage media may include volatile and non-volatile, transitory computer-readable storage media and non-transitory computer-readable storage media, and removable and non-removable media implemented in any method or technology. Computer-readable storage media includes, but is not limited to, RAM, ROM, erasable programmable ROM (“EPROM”), electrically erasable programmable ROM (“EEPROM”), flash memory or other solid-state memory technology, compact disc ROM (“CD-ROM”), digital versatile disk (“DVD”), high definition DVD (“HD-DVD”), BLU-RAY, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, other magnetic storage devices, or any other medium that may be used to store the desired information in a non-transitory fashion.

A mass storage device, such as the mass storage device 34 depicted in FIG. 1, may store an operating system utilized to control the operation of the computing device 20. The operating system may comprise a version of the LINUX operating system. The operating system may comprise a version of the WINDOWS SERVER operating system from the MICROSOFT Corporation. According to further aspects, the operating system may comprise a version of the UNIX operating system. Various mobile phone operating systems, such as IOS and ANDROID, may also be utilized. It should be appreciated that other operating systems may also be utilized. The mass storage device 34 may store other system or application programs and data utilized by the computing device 20.

The mass storage device 34 or other computer-readable storage media may also be encoded with computer-executable instructions, which, when loaded into the computing device 20, transform the computing device from a general-purpose computing system into a special-purpose computer capable of implementing the aspects described herein. These computer-executable instructions transform the computing device 20 by specifying how the CPU(s) 22 transition between states, as described above. The computing device 20 may have access to computer-readable storage media storing computer-executable instructions, which, when executed by the computing device 20, may perform the methods described herein.

A computing device, such as the computing device 20 depicted in FIG. 1, may also include an input/output controller 40 for receiving and processing input from a number of input devices, such as a keyboard, a mouse, a touchpad, a touch screen, an electronic stylus, or another type of input device. Similarly, an input/output controller 40 may provide output to a display, such as a computer monitor, a flat-panel display, a digital projector, a printer, a plotter, or another type of output device. It will be appreciated that the computing device 20 may not include all of the components shown in FIG. 1, may include other components that are not explicitly shown in FIG. 1, or may utilize an architecture completely different than that shown in FIG. 1.

As described herein, a computing device may be a physical computing device, such as the computing device 20 of FIG. 1. A computing node may also include a virtual machine host process and one or more virtual machine instances. Computer-executable instructions may be executed by the physical hardware of a computing device indirectly through interpretation and/or execution of instructions stored and executed in the context of a virtual machine.

Embodiment 1

FIG. 2 is a flowchart illustrating an example method of audio/video switching in accordance with the present disclosure. In an embodiment, the method may be applied in a client. The method may include blocks 101 to 103.

At block 101, detecting a playing state to obtain a first detection result when playing to-be-played content in an audio and video synchronous playback mode using DASH.

DASH (Dynamic Adaptive Streaming over HTTP) is an adaptive bitrate streaming technology that enables high-quality streaming media to be delivered over the Internet through conventional HTTP web servers.

In the embodiment, the playback modes include an audio and video synchronous playback mode, an audio and video pause mode, and an audio playback mode. In the audio and video synchronous playback mode, users can see video content and hear audio content. In the audio and video pause mode, users neither see video content nor hear audio content. In the audio playback mode, users can hear audio content but cannot see video content.

At block 102, determining whether switching to the audio playback mode is required according to the first detection result. If yes, the method goes to block 103; if no, the method returns to block 101.

In one exemplary embodiment, as illustrated in FIG. 3, the method may further include blocks 201 to 206.

At block 201, detecting whether an audio switching instruction is received.

At block 202, when the audio switching instruction is received, determining that switching to the audio playback mode is required.

At block 203, detecting whether a playback interface is minimized.

At block 204, when the playback interface is minimized, determining that switching to the audio playback mode is required.

At block 205, detecting whether playback software playing the to-be-played content is running in a background.

At block 206, when the playback software is running in the background, determining that switching to the audio playback mode is required.

The audio switching instruction may be triggered by a user. For example, the user can trigger the audio switching instruction by clicking a preset audio play button. When the audio switching instruction triggered by the user on a web terminal (for example, a PC) or a mobile terminal (for example, a smartphone) is received, the playback mode is automatically switched to the audio playback mode.

The playback interface is a display interface for playing streaming media content on a web page. When the playback interface is minimized by the user, the playback mode is automatically switched to the audio playback mode. Specifically, if the user minimizes the playback interface while the audio content and the video content are being played synchronously, and the playback mode has not been switched to the audio and video pause mode, the user wants to keep listening to the audio content but no longer needs to watch the video content. The playback mode is therefore automatically switched to the audio playback mode to reduce data traffic and power consumption.

The playback software is application software for playing the streaming media content on a mobile terminal. When the playback software is switched by the user to running in the background, the playback mode is automatically switched to the audio playback mode.

In the embodiment, when the streaming media content is played through the application software on the mobile terminal, whether to switch to the audio playback mode depends on whether an audio switching instruction triggered by the user is received or on whether the playback software is switched from running in the foreground to running in the background. When the streaming media content is played through a web page, whether to switch to the audio playback mode depends on whether an audio switching instruction triggered by the user is received or on whether the playback interface is minimized by the user. The embodiment thus provides several ways of determining that switching to the audio playback mode is required, which simplifies user operations and improves the user experience.
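By way of illustration only, the determination of blocks 201 to 206 may be sketched as follows. This is a minimal sketch and not the disclosed implementation; the PlayingState class, its field names, and the "web"/"mobile" platform labels are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class PlayingState:
    """Snapshot of the detected playing state (all names are illustrative)."""
    platform: str                         # assumed labels: "web" or "mobile"
    audio_switch_requested: bool = False  # blocks 201 and 202
    interface_minimized: bool = False     # blocks 203 and 204 (web page)
    running_in_background: bool = False   # blocks 205 and 206 (mobile app)


def needs_audio_mode(state: PlayingState) -> bool:
    """Return True when switching to the audio playback mode is required."""
    # An explicit user instruction applies on either platform.
    if state.audio_switch_requested:
        return True
    # On a web page, minimizing the playback interface triggers the switch.
    if state.platform == "web":
        return state.interface_minimized
    # On a mobile terminal, moving the app to the background triggers it.
    if state.platform == "mobile":
        return state.running_in_background
    return False
```

In this sketch the explicit instruction is checked first, so a user request is honored regardless of whether the interface is minimized or the software is in the background.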

At block 103, determining a segment number corresponding to content currently being played based on a timestamp of the content, obtaining audio content of the to-be-played content from a server, and playing the audio content in the audio playback mode.

In traditional audio and video playback technologies, audio content and video content are located in a single streaming media file. The client obtains the streaming media file from the server, parses the streaming media file to obtain the audio content and the video content, and plays the audio content and the video content. In DASH technologies, the server compresses and encapsulates the video content of the streaming media content to form a video data file, and compresses and encapsulates the audio content of the streaming media content to form an audio data file. The client needs to obtain the video data file and the audio data file from the server separately, and performs the audio and video playing by parsing the video data file and the audio data file.

In the embodiment, the server stores video data files formed from the video content of the to-be-played content and audio data files formed from the audio content of the to-be-played content. Because the audio content and the video content are kept separate in DASH, when audio-only playing is required, the client transmits a request for obtaining the audio data file to the server and does not transmit any request for obtaining the video data file.

In one exemplary embodiment, the block 103 may include blocks A1 to A4.

At block A1, determining a segment number corresponding to content currently being played based on a timestamp of the content, continuing to obtain the audio content corresponding to the segment number from an audio buffer area while ceasing to obtain the video content from a video buffer area, and playing a current media segment in the audio playback mode according to the audio content.

In DASH technologies, the streaming media content may be divided into a plurality of media segments, and each media segment corresponds to content of a preset time length (for example, 10 seconds). Each media segment includes an audio data file and a video data file. Before playing one media segment, the client separately obtains the audio data file and the video data file corresponding to the media segment from the server, then caches the audio content obtained by parsing the audio data file in the audio buffer area, and caches the video content obtained by parsing the video data file in the video buffer area.
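The disclosure does not give a formula for mapping a timestamp to a segment number. One possible mapping, assuming that all media segments have the same preset time length and that segment numbering starts at 1 (both assumptions made only for this sketch), is shown below; the function and parameter names are hypothetical.

```python
import math


def segment_number(timestamp_s: float,
                   segment_duration_s: float = 10.0,
                   first_number: int = 1) -> int:
    """Map a playback timestamp in seconds to a segment number.

    Assumes equal-length segments; the 10-second default follows the
    example time length mentioned in the description.
    """
    if timestamp_s < 0:
        raise ValueError("timestamp must be non-negative")
    if segment_duration_s <= 0:
        raise ValueError("segment duration must be positive")
    # Count the whole segments that lie before the timestamp.
    return first_number + math.floor(timestamp_s / segment_duration_s)
```

Under these assumptions, a timestamp of 25 seconds falls in the third segment, since two complete 10-second segments precede it.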

In the embodiment, when switching to the audio playback mode is required during a process of playing one media segment of the to-be-played content, the client may disconnect the link of the video stream and maintain the link of the audio stream to continue playing the audio content.

At block A2, transmitting to the server a request for obtaining an audio data file of a next media segment.

Starting from the next media segment, the client transmits to the server requests for obtaining the audio data file of the to-be-played content and does not transmit to the server any request for obtaining the video data file of the to-be-played content.

At block A3, receiving the audio data file of the to-be-played content from the server, and parsing the audio data file to obtain the audio content.

Parsing the audio data file to obtain the audio content may further include:

performing a decapsulation operation on the audio data file to obtain audio stream compression encoded data, and performing a decoding operation on the audio stream compression encoded data to obtain the audio content.

Decapsulation is also known as demultiplexing. Decapsulation is used to separate a file in an encapsulation format (for example, AVI format, MP4 format, FLV format) into audio stream compression encoded data or video stream compression encoded data. Decoding is a process, performed by a decoder, of recovering the compressed data into an audio signal and a video signal.

At block A4, playing the next media segment in the audio playback mode according to the audio content.

In the embodiment, when the audio playback mode is required, the client plays the buffered audio content of the current media segment in the audio playback mode and, starting from the next media segment, obtains only audio data files until the playback mode is switched to the audio and video synchronous playback mode or to the audio and video pause mode.
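The behavior of blocks A1 to A4 may be pictured as a simple fetch plan: the current media segment is served from the audio buffer area, and only audio is requested from the server for later segments. The following is a sketch under that reading; the function name and the "buffer"/"server" labels are hypothetical.

```python
from typing import List, Tuple


def audio_mode_fetch_plan(current_segment: int,
                          last_segment: int) -> List[Tuple[int, str, str]]:
    """Return (segment number, track, source) entries for audio-only playback."""
    # Block A1: the current segment's audio is already in the audio buffer.
    plan = [(current_segment, "audio", "buffer")]
    # Blocks A2 to A4: later segments request the audio data file only.
    for number in range(current_segment + 1, last_segment + 1):
        plan.append((number, "audio", "server"))
    return plan
```

No video entry appears anywhere in the plan, which reflects the saving in data traffic described for the audio playback mode.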

In one exemplary embodiment, as illustrated in FIG. 4, the method may further include block 301 and block 302.

At block 301, when an audio language switching instruction is received during a process of playing the to-be-played content in the audio playback mode, obtaining the audio content corresponding to a language from the server according to the audio language switching instruction.

At block 302, playing the audio content corresponding to the language.

For example, the to-be-played content is an American movie available in two languages, for example, English and Chinese. During the process of playing the to-be-played content in the audio playback mode, English or Chinese can be chosen according to the user's selection.

In one exemplary embodiment, as illustrated in FIG. 5, the method may further include block 401 and block 402.

At block 401, when an audio quality switching instruction is received during a process of playing the to-be-played content in the audio playback mode, obtaining the audio content corresponding to audio quality from the server according to the audio quality switching instruction.

At block 402, playing the audio content corresponding to the audio quality.

For example, the audio content of the to-be-played content is available in three audio qualities, for example, standard audio quality, high audio quality, and lossless audio quality. The audio content corresponding to the selected audio quality can be chosen according to the user's selection during the process of playing the to-be-played content in the audio playback mode.
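The audio quality switch of blocks 401 and 402, and in the same way the language switch of blocks 301 and 302, amounts to choosing one audio track among those the server offers. A minimal sketch follows; the AudioTrack class, the track identifiers, and the language and quality labels are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AudioTrack:
    """One audio variant offered by the server (illustrative fields)."""
    track_id: str
    language: str   # e.g. "en" or "zh"
    quality: str    # e.g. "standard", "high" or "lossless"


def select_audio_track(tracks: List[AudioTrack],
                       language: str,
                       quality: str) -> Optional[AudioTrack]:
    """Return the first track matching the user's selection, or None."""
    for track in tracks:
        if track.language == language and track.quality == quality:
            return track
    return None
```

Returning None when no track matches leaves it to the caller to decide whether to keep the current track or fall back to another quality; the disclosure does not specify that behavior.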

In one exemplary embodiment, FIG. 6 is a flowchart illustrating another example method for audio/video switching in accordance with the present disclosure. The method may include blocks 501 to 504.

At block 501, detecting a playing state to obtain a second detection result, during a process of playing the to-be-played content in the audio playback mode.

At block 502, determining whether switching to the audio and video synchronous playback mode is required according to the second detection result.

In one exemplary embodiment, as illustrated in FIG. 7, the method may further include blocks 601 to 606.

At block 601, detecting whether an audio and video switching instruction is received.

At block 602, determining that switching to the audio and video synchronous playback mode is required in response to a detection that the audio and video switching instruction is received.

At block 603, detecting whether a playback interface minimization is canceled.

At block 604, determining that switching to the audio and video synchronous playback mode is required in response to a detection that the playback interface minimization is canceled.

At block 605, detecting whether playback software is switched from running in a background to running in a foreground.

At block 606, determining that switching to the audio and video synchronous playback mode is required in response to a detection that the playback software is running in the foreground.

The audio and video switching instruction may be triggered by a user. For example, the user can trigger the audio and video switching instruction by clicking a preset play button. When the audio and video switching instruction triggered by the user on a web terminal (for example, a PC) or a mobile terminal (for example, a smartphone) is received, the playback mode is automatically switched to the audio and video synchronous playback mode.

The playback interface is a display interface for playing streaming media content on a web page. When the playback interface minimization is canceled by the user, the playback mode is automatically switched to the audio and video synchronous playback mode. The playback software is application software for playing the streaming media content on the mobile terminal. When the playback software is switched to running in the foreground by the user, the playback mode is automatically switched to the audio and video synchronous playback mode.

In the embodiment, when the streaming media content is played through the application software on the mobile terminal, whether to switch to the audio and video synchronous playback mode depends on whether an audio and video switching instruction triggered by the user is received or on whether the playback software is switched from running in the background to running in the foreground. When the streaming media content is played through a web page, whether to switch to the audio and video synchronous playback mode depends on whether an audio and video switching instruction triggered by the user is received or on whether the playback interface minimization is canceled by the user.

At block 503, in response to a determination of switching to the audio and video synchronous playback mode, determining a segment number corresponding to audio content currently being played based on a timestamp of the audio content, and obtaining audio content and video content of the to-be-played content from the server simultaneously.

In one exemplary embodiment, the block 503 may further include blocks B1 to B3.

At block B1, obtaining a timestamp of audio content of the current media segment, and determining a segment number of the audio content according to the timestamp.

At block B2, sending a request to the server to obtain an audio data file corresponding to the segment number and a video data file corresponding to the segment number.

At block B3, receiving the audio data file and the video data file from the server, parsing the audio data file to obtain the audio content, and parsing the video data file to obtain the video content.

In the embodiment, the to-be-played content is divided into multiple media segments, and each media segment corresponds to a segment number. Each segment number corresponds to an audio data file and a video data file.
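Blocks B1 and B2 may be sketched as deriving the segment number from the audio timestamp and then requesting both data files for that number. The sketch below assumes equal-length segments numbered from 1, and the file path pattern is invented purely for illustration; the disclosure does not define either.

```python
import math
from typing import List


def resync_requests(audio_timestamp_s: float,
                    segment_duration_s: float = 10.0,
                    first_number: int = 1) -> List[str]:
    """Return the two files to request when switching back to audio and video."""
    # Block B1: derive the segment number from the audio timestamp.
    number = first_number + math.floor(audio_timestamp_s / segment_duration_s)
    # Block B2: request the audio and the video data file of that segment.
    # The path layout below is hypothetical.
    return [f"audio/segment-{number}.m4s", f"video/segment-{number}.m4s"]
```

Because both requests carry the same segment number, the video content obtained lines up with the audio content that is currently being played.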

In one exemplary embodiment, the block B3 may include blocks C1 and C2.

At block C1, performing a decapsulation operation on the audio data file to obtain audio stream compression encoded data, and performing a decoding operation on the audio stream compression encoded data to obtain the audio content.

At block C2, performing a decapsulation operation on the video data file to obtain video stream compression encoded data, and performing a decoding operation on the video stream compression encoded data to obtain the video content.

At block 504, playing the content according to the audio content and the video content in the audio and video synchronous playback mode.

In the embodiment, during the process of playing the to-be-played content in the audio playback mode, whether to switch from the audio playback mode to the audio and video synchronous playback mode depends on whether a user instruction or a user operation is received. When it is necessary to switch from the audio playback mode to the audio and video synchronous playback mode, the video content and the audio content of the current media segment are reacquired from the server according to the audio content currently being played, so that the current media segment is replayed in the audio and video synchronous playback mode.

Embodiment 2

FIG. 8 is a block diagram of program modules of an apparatus for audio/video switching in accordance with the present disclosure. The apparatus may be partitioned into one or more program modules which are stored in a storage medium and executed by one or more processors to complete the embodiments of the present application. The program module in the embodiment of the present application refers to a series of computer program instruction segments capable of performing specific functions and is more suitable than the program itself for describing the execution process of the apparatus for audio/video switching in the storage medium. The following specifically describes functions of the program modules in the embodiment.

As shown in FIG. 8, the apparatus applied in a client may include a detecting module 401, a determining module 402 and a processing module 403, wherein:

The detecting module 401 is configured to detect a playing state to obtain a first detection result when playing to-be-played content in an audio and video synchronous playback mode using DASH.

The determining module 402 is configured to determine whether switching to an audio playback mode is required according to the first detection result.

In one exemplary embodiment, the detecting module 401 is further configured to detect whether an audio switching instruction is received. The determining module 402 is configured to determine that switching to the audio playback mode is required, in response to a detection that the audio switching instruction is received.

In one exemplary embodiment, the detecting module 401 is further configured to detect whether a playback interface is minimized. The determining module 402 is configured to determine that switching to the audio playback mode is required, in response to a detection that the playback interface is minimized.

In one exemplary embodiment, the detecting module 401 is further configured to detect whether playback software playing the to-be-played content is switched to running in a background. The determining module 402 is configured to determine that switching to the audio playback mode is required, in response to a detection that the playback software is switched to running in the background.

The processing module 403 is configured to obtain audio content of the to-be-played content in response to a determination of switching to the audio playback mode and play the audio content in the audio playback mode.

In one exemplary embodiment, the apparatus may further include a first switching module and a second switching module.

The first switching module is configured to obtain the audio content corresponding to a language in response to an audio language switching instruction during a process of playing the to-be-played content in the audio playback mode and play the audio content corresponding to the language.

The second switching module is configured to obtain the audio content corresponding to audio quality in response to an audio quality switching instruction during a process of playing the to-be-played content in the audio playback mode and play the audio content corresponding to the audio quality.

In one exemplary embodiment, during a process of playing theto-be-played content in the audio playback mode, the detecting module401 is further configured to detect a playing state to obtain a seconddetection result. The determining module 402 is configured to determineto switch to the audio and video synchronous playback mode is requiredaccording to the second detection result.

In one exemplary embodiment, during a process of playing theto-be-played content in the audio playback mode, the detecting module401 is further configured to detect whether an audio and video switchinginstruction is received. The determining module 402 is configured todetermine that switching to the audio and video synchronous playbackmode is required in response to a detection that the audio and videoswitching instruction is received.

In one exemplary embodiment, during a process of playing the to-be-played content in the audio playback mode, the detecting module 401 is further configured to detect whether minimization of a playback interface is canceled. The determining module 402 is configured to determine that switching to the audio and video synchronous playback mode is required in response to a detection that the minimization of the playback interface is canceled.

In one exemplary embodiment, during a process of playing the to-be-played content in the audio playback mode, the detecting module 401 is further configured to detect whether playback software playing the to-be-played content is switched from running in a background to running in a foreground. The determining module 402 is configured to determine that switching to the audio and video synchronous playback mode is required in response to a detection that the playback software is switched from running in the background to running in the foreground.

In one exemplary embodiment, during a process of playing the to-be-played content in the audio playback mode, the processing module 403 is configured to obtain the audio content and the video content of the to-be-played content simultaneously in response to a determination to switch to the audio and video synchronous playback mode, and to play the audio content and the video content in the audio and video synchronous playback mode.
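The disclosure determines a segment number from the timestamp of the content currently being played and then obtains content according to the mode being switched to. A minimal sketch of that step follows, assuming fixed-duration segments numbered consecutively from a start number. The formula and the names `segment_number` and `segments_to_request` are assumptions of this sketch, not the method of the disclosure.

```python
from typing import List, Tuple


def segment_number(
    timestamp_s: float,
    segment_duration_s: float,
    start_number: int = 1,
) -> int:
    """Map a playback timestamp to a segment number.

    Assumes every segment has the same duration and that numbering
    starts at start_number (both are assumptions of this sketch).
    """
    if segment_duration_s <= 0:
        raise ValueError("segment_duration_s must be positive")
    return start_number + int(timestamp_s // segment_duration_s)


def segments_to_request(audio_only: bool, number: int) -> List[Tuple[str, int]]:
    """List the (stream, segment number) pairs to fetch in the target mode."""
    if audio_only:
        return [("audio", number)]
    # Audio and video synchronous mode: fetch both streams for the same segment.
    return [("video", number), ("audio", number)]
```

With 4-second segments, a playback position of 125 seconds maps to `segment_number(125.0, 4.0) == 32`, and switching back to the audio and video synchronous playback mode would request both the video and the audio segment numbered 32 so that the two streams resume at the same position.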

What is claimed is:
1. A computer-implemented method of switching between different playback modes, comprising: detecting a state of playing a content item using Dynamic Adaptive Streaming over HTTP (DASH), the content item comprising a plurality of segments; determining whether there is a need of switching between a first playback mode and a second playback mode based on the detected state of playing the content item, the first playback mode being a mode of synchronously playing video and audio of the content item, and a second playback mode being a mode of playing the audio of the content item only; determining a segment number based on a timestamp of content currently being played and corresponding to a segment among the plurality of segments in response to determining that there is the need of switching between the first playback mode and the second playback mode; obtaining content of the content item based at least in part on the segment number and a playback mode to be switched to; playing the content item in a switched playback mode; wherein the determining whether there is a need of switching between a first playback mode and a second playback mode based on the detected state of playing the content item further comprises: determining to switch from the first playback mode to the second playback mode in response to detecting that a playback interface of playing the content item is minimized, detecting that a playback software of playing the content item is running in a background, or detecting that an instruction of switching from the first playback mode to the second playback mode is received; wherein the computer-implemented method further comprises: obtaining content of an audio segment corresponding to the segment number from a local buffer area; and sending a server requests for subsequent audio segments only until switching to the first playback mode or a pause mode.
2. The computer-implemented method of claim 1, wherein the detecting a state of playing a content item further comprises at least one of: detecting whether a playback interface of playing the content item is minimized; detecting whether a playback software of playing the content item is running in a background or a foreground; and detecting whether an instruction of switching between a first playback mode and a second playback mode is received.
3. The computer-implemented method of claim 1, further comprising: receiving the subsequent audio segments from the server; performing decapsulation operations and decoding operations on the subsequent audio segments; and playing the content item in the second playback mode.
4. The computer-implemented method of claim 3, further comprising: obtaining audio segments corresponding to a language in response to receiving an instruction of specifying the language; and playing the content item in the language.
5. The computer-implemented method of claim 3, further comprising: obtaining audio segments corresponding to an audio quality in response to receiving an instruction of specifying the audio quality; and playing the content item with the audio quality.
6. The computer-implemented method of claim 1, wherein the determining whether there is a need of switching between a first playback mode and a second playback mode based on the detected state of playing the content item further comprises: determining to switch from the second playback mode to the first playback mode in response to detecting that a minimization of a playback interface of playing the content item is cancelled, detecting that a playback software of playing the content item is running in a foreground, or detecting that an instruction of switching from the second playback mode to the first playback mode is received.
7. The computer-implemented method of claim 6, further comprising: sending a request for a video segment and an audio segment corresponding to the segment number and subsequent requests for subsequent video segments and audio segments until switching to the second playback mode or a pause mode; and receiving the video segment and the audio segment corresponding to the segment number and the subsequent video segments and audio segments.
8. The computer-implemented method of claim 7, further comprising: performing decapsulation operations and decoding operations on the received video segments and audio segments; and playing the content item in the first playback mode.
9. A computing device of switching between different playback modes, comprising: at least one processor; and at least one memory communicatively coupled to the at least one processor and storing instructions that upon execution by the at least one processor cause the computing device to: detect a state of playing a content item using Dynamic Adaptive Streaming over HTTP (DASH), the content item comprising a plurality of segments; determine whether there is a need of switching between a first playback mode and a second playback mode based on the detected state of playing the content item, the first playback mode being a mode of synchronously playing video and audio of the content item, and a second playback mode being a mode of playing the audio of the content item only; determine a segment number based on a timestamp of content currently being played and corresponding to a segment among the plurality of segments in response to determining that there is the need of switching between the first playback mode and the second playback mode; obtain content of the content item based at least in part on the segment number and a playback mode to be switched to; play the content item in a switched playback mode; wherein the at least one memory further stores instructions that upon execution by the at least one processor cause the computing device to: determine to switch from the first playback mode to the second playback mode in response to detecting that a playback interface of playing the content item is minimized, detecting that a playback software of playing the content item is running in a background, or detecting that an instruction of switching from the first playback mode to the second playback mode is received; obtain content of an audio segment corresponding to the segment number from a local buffer area; and send a server requests for subsequent audio segments only until switching to the first playback mode or a pause mode.
10. The computing device of claim 9, the at least one memory further stores instructions that upon execution by the at least one processor cause the computing device to: receive the subsequent audio segments from the server; perform decapsulation operations and decoding operations on the subsequent audio segments; and play the content item in the second playback mode.
11. The computing device of claim 10, the at least one memory further stores instructions that upon execution by the at least one processor cause the computing device to: obtain audio segments corresponding to a language in response to receiving an instruction of specifying the language; and play the content item in the language.
12. The computing device of claim 10, the at least one memory further stores instructions that upon execution by the at least one processor cause the computing device to: obtain audio segments corresponding to an audio quality in response to receiving an instruction of specifying the audio quality; and play the content item with the audio quality.
13. The computing device of claim 9, the at least one memory further stores instructions that upon execution by the at least one processor cause the computing device to: determine to switch from the second playback mode to the first playback mode in response to detecting that a minimization of a playback interface of playing the content item is cancelled, detecting that a playback software of playing the content item is running in a foreground, or detecting that an instruction of switching from the second playback mode to the first playback mode is received.
14. The computing device of claim 13, the at least one memory further stores instructions that upon execution by the at least one processor cause the computing device to: send a request for a video segment and an audio segment corresponding to the segment number and subsequent requests for subsequent video segments and audio segments until switching to the second playback mode or a pause mode; and receive the video segment and the audio segment corresponding to the segment number and the subsequent video segments and audio segments.
15. A non-transitory computer-readable storage medium, storing computer-readable instructions that upon execution by a processor cause the processor to perform operations, the operations comprising: detecting a state of playing a content item using Dynamic Adaptive Streaming over HTTP (DASH), the content item comprising a plurality of segments; determining whether there is a need of switching between a first playback mode and a second playback mode based on the detected state of playing the content item, the first playback mode being a mode of synchronously playing video and audio of the content item, and a second playback mode being a mode of playing the audio of the content item only; determining a segment number based on a timestamp of content currently being played and corresponding to a segment among the plurality of segments in response to determining that there is the need of switching between the first playback mode and the second playback mode; obtaining content of the content item based at least in part on the segment number and a playback mode to be switched to; playing the content item in a switched playback mode; wherein the operations further comprise: determining to switch from the first playback mode to the second playback mode in response to detecting that a playback interface of playing the content item is minimized, detecting that a playback software of playing the content item is running in a background, or detecting that an instruction of switching from the first playback mode to the second playback mode is received; obtaining content of an audio segment corresponding to the segment number from a local buffer area; and sending a server requests for subsequent audio segments only until switching to the first playback mode or a pause mode.
16. The non-transitory computer-readable storage medium of claim 15, wherein the detecting a state of playing a content item further comprises at least one of: detecting whether a playback interface of playing the content item is minimized; detecting whether a playback software of playing the content item is running in a background or a foreground; and detecting whether an instruction of switching between a first playback mode and a second playback mode is received.